Multi-Label Informed Feature Selection
نویسندگان
چکیده
Multi-label learning has been extensively studied in the area of bioinformatics, information retrieval, multimedia annotation, etc. In multi-label learning, each instance is associated with multiple interdependent class labels, the label information can be noisy and incomplete. In addition, multi-labeled data often has high-dimensional noisy, irrelevant and redundant features. As an effective data preprocessing step, feature selection has shown its effectiveness to prepare high-dimensional data for numerous data mining and machine learning tasks. Most of existing multi-label feature selection algorithms either boil down to solve multiple singlelabeled feature selection problems or directly make use of the flawed labels. Therefore, they may not be able to find discriminative features that are shared by multiple labels. In this paper, we propose a novel multi-label informed feature selection framework MIFS, which exploits label correlations to select discriminative features across multiple labels. Specifically, to reduce the negative effects of imperfect label information in finding label correlations, we decompose the multi-label information into a low-dimensional space and then employ the reduced space to steer the feature selection process. Empirical studies on real-world datasets demonstrate the effectiveness and efficiency of the proposed framework.
منابع مشابه
MLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection
Multi-label classification has gained significant attention during recent years, due to the increasing number of modern applications associated with multi-label data. Despite its short life, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which utilizes a new strategy to multi-label learning by leveraging label-specific ...
متن کاملExploiting Multi-Label Information for Noise Resilient Feature Selection
In conventional supervised learning paradigm, each data instance is associated with one single class label. Multi-label learning differs in the way that data instances may belong to multiple concepts simultaneously, which naturally appear in a variety of high impact domains, ranging from bioinformatics, information retrieval to multimedia analysis. It targets to leverage the multiple label info...
متن کاملMutual Information-based multi-label feature selection using interaction information
Multi-label feature selection is regarded as one of the most promising techniques that can be used to maximize the efficacy and efficiency of multi-label classification. However, because multi-label feature selection algorithms must consider multiple labels concurrently, the task is more difficult than singlelabel feature selection tasks. In this paper, we propose the Mutual Information-based m...
متن کاملEfficient Multi-Label Feature Selection Using Entropy-Based Label Selection
Abstract: Multi-label feature selection is designed to select a subset of features according to their importance to multiple labels. This task can be achieved by ranking the dependencies of features and selecting the features with the highest rankings. In a multi-label feature selection problem, the algorithm may be faced with a dataset containing a large number of labels. Because the computati...
متن کاملA New Genetic Algorithm for Multi-Label Correlation-Based Feature Selection
This paper proposes a new Genetic Algorithm for Multi-Label Correlation-Based Feature Selection (GA-ML-CFS). This GA performs a global search in the space of candidate feature subsets, in order to select a high-quality feature subset that is used by a multi-label classification algorithm – in this work, the Multi-Label k-NN algorithm. We compare the results of GA-ML-CFS with the results of the ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016